Path-conditioned training: a principled way to rescale ReLU neural networks
Arthur Lebeurrier, Titouan Vayer, Rémi Gribonval
Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two properly rescaled sets of parameters implement the same function, their training dynamics can be dramatically different. To offer a fresh perspective on exploiting this phenomenon, we build on the recent path-lifting framework, which provides a compact factorization of ReLU networks. We introduce a geometrically motivated criterion for rescaling neural network parameters whose minimization leads to a conditioning strategy that aligns a kernel in the path-lifting space with a chosen reference. We derive an efficient algorithm to perform this alignment. In the setting of random network initialization, we analyze how the architecture and the initialization scale jointly impact the output of the proposed method. Numerical experiments illustrate its potential to speed up training.
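The rescaling symmetry invoked above follows from the positive homogeneity of ReLU: multiplying the incoming weights (and bias) of a hidden neuron by some lambda > 0 and dividing its outgoing weights by the same lambda leaves the network function unchanged. The following is a minimal NumPy sketch of this invariance on a one-hidden-layer network; the layer sizes, seed, and `forward` helper are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-hidden-layer ReLU network: f(x) = W2 @ relu(W1 @ x + b1) + b2
d_in, d_hidden, d_out = 3, 5, 2
W1, b1 = rng.normal(size=(d_hidden, d_in)), rng.normal(size=d_hidden)
W2, b2 = rng.normal(size=(d_out, d_hidden)), rng.normal(size=d_out)

def forward(W1, b1, W2, b2, x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Per-neuron rescaling: scale each hidden neuron's incoming weights and bias
# by lam > 0 and divide its outgoing weights by lam.  Since
# relu(lam * z) = lam * relu(z) for lam > 0, the network function is unchanged.
lam = rng.uniform(0.1, 10.0, size=d_hidden)
W1r, b1r = lam[:, None] * W1, lam * b1
W2r = W2 / lam[None, :]

x = rng.normal(size=d_in)
print(np.allclose(forward(W1, b1, W2, b2, x),
                  forward(W1r, b1r, W2r, b2, x)))  # True
```

The same per-neuron rescaling can be applied independently at every hidden layer of a deeper ReLU network; it is this family of symmetries that the proposed path-lifting-based conditioning is designed to exploit.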
Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse
Neural networks are often trained on multiple tasks, either simultaneously (multi-task learning, MTL) or sequentially (pretraining and subsequent finetuning, PT+FT). In particular, it is common practice to pretrain neural networks on a large auxiliary task before finetuning on a downstream task with fewer samples. Despite the prevalence of this approach, the inductive biases that arise from learning multiple tasks are poorly characterized. In this work, we address this gap.
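As a concrete illustration of the PT+FT regime described above, here is a minimal PyTorch sketch: pretrain a shared backbone on a large auxiliary task, then reuse the backbone with a fresh head on a small downstream task. The toy architecture, data shapes, and hyperparameters are arbitrary illustrative choices, not the paper's experimental setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Shared feature extractor with task-specific heads (hypothetical toy setup).
backbone = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU())
head_aux = nn.Linear(64, 5)    # head for the large auxiliary (pretraining) task
head_down = nn.Linear(64, 2)   # head for the small downstream task

def make_data(n, num_classes):
    return torch.randn(n, 10), torch.randint(num_classes, (n,))

loss_fn = nn.CrossEntropyLoss()

# --- Pretraining on the auxiliary task (many samples) ---
x_aux, y_aux = make_data(2048, 5)
opt = torch.optim.Adam(list(backbone.parameters()) + list(head_aux.parameters()), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss_fn(head_aux(backbone(x_aux)), y_aux).backward()
    opt.step()

# --- Finetuning on the downstream task (few samples), reusing the backbone ---
x_dn, y_dn = make_data(64, 2)
opt = torch.optim.Adam(list(backbone.parameters()) + list(head_down.parameters()), lr=1e-4)
for _ in range(100):
    opt.zero_grad()
    loss_fn(head_down(backbone(x_dn)), y_dn).backward()
    opt.step()
```

In the alternative MTL regime, both heads would instead be trained jointly on a summed loss over the two tasks; the paper studies the inductive biases that distinguish these two ways of sharing features.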
A Proof of Proposition 2.5
Proposition 2.5 is a direct consequence of Lemma A.1 (smooth functions conserved through a given flow). The proof first establishes the direct inclusion, then the converse inclusion, recalling (cf. Examples 2.10 and 2.11) the linear case and Assumption 2.9, and finally shows, via Theorem 2.14, that (9) holds for standard ML losses.